Finance Data Capstone Project

An exploratory data analysis of stock prices, meant to practice my visualization and pandas skills. It is not meant to be a robust financial analysis or to be taken as financial advice.

Posted by Afdhal Afgani on February 1, 2021

Finance Data Capstone Project is my own exercise project from the Udemy course Python for Data Science and Machine Learning Bootcamp by Jose Portilla. In this data project I will focus on exploratory data analysis of stock prices.

For this capstone project I intended to analyze stock price data from Google Reader. Unfortunately, I can't access Google Reader from my computer, so I will use this data instead. This data contains stock information for the following banks:

  • Bank of America
  • Citigroup
  • Goldman Sachs
  • JPMorgan Chase
  • Morgan Stanley
  • Wells Fargo

For each bank, the data contains the following fields:

  • Open, double precision float
  • High, double precision float
  • Low, double precision float
  • Close, double precision float
  • Volume, integer

Project Intro/Objective

The purpose of this project is to practice my visualization and pandas skills. It is not meant to be a robust financial analysis or to be taken as financial advice.

Project Library

  • Numpy
  • Pandas
  • Matplotlib
  • Seaborn
  • datetime
  • matplotlib.pyplot
  • Plotly
  • Cufflinks

Data and Setup

In this section, I want to show some information about the data. I will use .head() to preview the first rows of the full dataset and of the Bank of America (BAC) data.
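As a rough sketch of this setup step: the code below builds one small frame per bank from randomly generated numbers (not the project's real data) and joins them side by side. The variable name bank_stocks and the column level names 'Bank Ticker' and 'Stock Info' are my own assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data: the real project loads one price frame per bank.
tickers = ["BAC", "C", "GS", "JPM", "MS", "WFC"]
dates = pd.date_range("2015-01-01", periods=10, freq="B")
rng = np.random.default_rng(0)


def make_frame():
    """Build a fake frame with the five fields listed above."""
    close = 50 + rng.normal(0, 1, len(dates)).cumsum()
    return pd.DataFrame(
        {
            "Open": close - 0.1,
            "High": close + 0.5,
            "Low": close - 0.5,
            "Close": close,
            "Volume": rng.integers(1_000, 5_000, len(dates)),
        },
        index=dates,
    )


frames = [make_frame() for _ in tickers]

# Join the frames side by side; the tickers become the outer column level.
bank_stocks = pd.concat(frames, axis=1, keys=tickers)
bank_stocks.columns.names = ["Bank Ticker", "Stock Info"]  # assumed names

print(bank_stocks.head())         # first five rows for every bank
print(bank_stocks["BAC"].head())  # first five rows for Bank of America only
```

Note that .head() only returns the first five rows by default, so it is a quick preview rather than a view of the whole dataset.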

Exploratory Data Analysis

Across all the banks' data, I want to see the maximum Close price for each bank's stock over the whole time period.
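One way to get this, assuming the combined frame has two column levels (ticker, then field) with the hypothetical level names used here, is a cross-section on 'Close' followed by .max(). The numbers are made up for illustration.

```python
import pandas as pd

# Tiny made-up frame with two column levels: ticker, then field.
dates = pd.date_range("2015-01-01", periods=3, freq="D")
columns = pd.MultiIndex.from_product(
    [["BAC", "C"], ["Close", "Volume"]],
    names=["Bank Ticker", "Stock Info"],  # assumed level names
)
bank_stocks = pd.DataFrame(
    [
        [17.0, 100, 54.0, 200],
        [18.5, 110, 53.0, 210],
        [16.0, 120, 55.5, 220],
    ],
    index=dates,
    columns=columns,
)

# Keep only the 'Close' column of every bank, then take the column-wise max.
max_close = bank_stocks.xs(key="Close", axis=1, level="Stock Info").max()
print(max_close)
```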

Then I create a new empty DataFrame called returns. This dataframe will contain the returns for each bank's stock.
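Here I take the return on day t to be the simple daily return, r_t = p_t / p_(t-1) - 1, where p_t is the Close price. That definition is my assumption of what "returns" means in this project, and pandas computes it with pct_change(). A minimal sketch with made-up prices and hypothetical column names:

```python
import pandas as pd

tickers = ["BAC", "C"]
dates = pd.date_range("2015-01-01", periods=3, freq="D")

# Made-up Close prices, one column per bank.
close = pd.DataFrame(
    {"BAC": [10.0, 11.0, 9.9], "C": [50.0, 50.0, 55.0]},
    index=dates,
)

# Start from an empty DataFrame and add one return column per bank.
returns = pd.DataFrame()
for ticker in tickers:
    # pct_change() computes p_t / p_(t-1) - 1 for each row.
    returns[ticker + " Return"] = close[ticker].pct_change()

print(returns)
```

The first row is NaN for every bank because there is no previous day to compare against.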

For further study, I can use pairplot to see how each bank's performance relates to the others:

For a better understanding, I look at the returns data for all banks: the minimum, maximum, and standard deviation of each bank's returns.

Returns Minimum
Returns Maximum
Returns Standard Deviation
Returns 2015 Standard Deviation
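The four summaries above can be sketched with pandas as follows. The returns values and column names are invented for illustration, and I am assuming the dates of the extremes are what matter, so I use idxmin() and idxmax() rather than min() and max().

```python
import pandas as pd

# Made-up daily returns spanning the end of 2014 and the start of 2015.
dates = pd.to_datetime(
    ["2014-12-30", "2014-12-31", "2015-01-02", "2015-01-05", "2015-01-06"]
)
returns = pd.DataFrame(
    {
        "BAC Return": [0.01, -0.05, 0.02, 0.00, -0.02],
        "C Return": [0.03, 0.01, -0.01, 0.01, 0.00],
    },
    index=dates,
)

worst_day = returns.idxmin()  # date of the lowest return per bank
best_day = returns.idxmax()   # date of the highest return per bank
volatility = returns.std()    # sample standard deviation over all dates

# Restrict to 2015 by slicing the DatetimeIndex with date strings.
volatility_2015 = returns.loc["2015-01-01":"2015-12-31"].std()

print(worst_day, best_day, volatility, volatility_2015, sep="\n\n")
```

A larger standard deviation means the daily returns were more spread out, which is a common rough proxy for how risky a stock was over that period.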

Then I look at distribution plots of the Morgan Stanley returns for 2015 and the Citigroup returns for 2008.

Data Visualization

In this section, I visualize some of the financial data. First, I import the required modules, then plot the Close price of each bank over the entire time index (2008 - 2016). I create a simple line plot of all the banks' Close prices twice, once using a for loop and once using .xs, and then check whether there is any difference between the two plots.

I can see that the for loop and .xs produce the same plot. The difference is that .xs needs less code than the for loop.
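To illustrate why the two plots match, here is a sketch that selects the Close prices both ways and checks that the data are identical. The frame is made up and the column level names are my own assumptions. Calling .plot() on either result would draw the line chart.

```python
import pandas as pd

tickers = ["BAC", "C", "GS"]
dates = pd.date_range("2015-01-01", periods=4, freq="D")
columns = pd.MultiIndex.from_product(
    [tickers, ["Open", "Close"]],
    names=["Bank Ticker", "Stock Info"],  # assumed level names
)

# Made-up prices: a different number in every cell so mix-ups would show.
values = [
    [float(10 * (row + 1) + col) for col in range(len(columns))]
    for row in range(len(dates))
]
bank_stocks = pd.DataFrame(values, index=dates, columns=columns)

# Approach 1: loop over the tickers and pull out each Close series.
close_by_loop = pd.DataFrame(index=bank_stocks.index)
for ticker in tickers:
    close_by_loop[ticker] = bank_stocks[ticker]["Close"]

# Approach 2: a single cross-section on the inner column level.
close_by_xs = bank_stocks.xs(key="Close", axis=1, level="Stock Info")

# Both approaches select exactly the same numbers.
same = bool((close_by_loop.to_numpy() == close_by_xs.to_numpy()).all())
print(same)
```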

After that, I create the line plot using iplot. The difference is that iplot produces an interactive graph, where I can read the data directly by moving my cursor over a line.

Then, I create a moving average plot using rolling with window = 30, and build the correlation DataFrame from the Close prices.
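Both calculations are short in pandas. The sketch below uses randomly generated prices rather than the real data, and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up Close prices for three banks over 60 business days.
dates = pd.date_range("2008-01-01", periods=60, freq="B")
rng = np.random.default_rng(1)
close = pd.DataFrame(
    {
        ticker: 50 + rng.normal(0, 1, len(dates)).cumsum()
        for ticker in ["BAC", "C", "GS"]
    },
    index=dates,
)

# 30-day moving average: each value is the mean of the current and the
# previous 29 Close prices, so the first 29 entries are NaN.
moving_avg = close["BAC"].rolling(window=30).mean()

# Pairwise Pearson correlation between the banks' Close prices.
close_corr = close.corr()

print(moving_avg.tail())
print(close_corr)
```

The correlation matrix is symmetric with ones on the diagonal, which is the shape a heatmap or cluster map expects.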

After creating the correlation DataFrame, I plot a heatmap and a cluster map as simple plots, which makes it easier to see the correlations in the data.

After creating the heatmap as a simple plot, I create the same heatmap using iplot to see the difference.

The difference is that iplot creates an interactive plot, where I can see the value at a given position by hovering my pointer over it, while the simple plot is a static image.

Furthermore, I try some more advanced financial analysis techniques. I plot them with iplot to get interactive plots.

First, I create a Bank of America candlestick chart for the year 2015. Using iplot with kind set to 'candle', I get a candlestick plot like this:

Second, I create Morgan Stanley simple moving averages for the year 2015. I use .ta_plot(study='sma') with periods [13, 21, 55]:
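To show what a simple moving average study computes, here is a plain pandas sketch of the same three periods. It uses made-up prices and does not call Cufflinks; the column labels are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up Close prices standing in for Morgan Stanley in 2015.
dates = pd.date_range("2015-01-01", periods=80, freq="B")
rng = np.random.default_rng(2)
ms_close = pd.Series(
    35 + rng.normal(0, 0.5, len(dates)).cumsum(), index=dates, name="Close"
)

periods = [13, 21, 55]

# One column per period: the mean of the last `p` Close prices.
sma = pd.DataFrame(
    {f"SMA({p})": ms_close.rolling(window=p).mean() for p in periods}
)

print(sma.tail())
```

A shorter period follows the price closely, while a longer one is smoother and reacts more slowly.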

Third, I create a Bank of America Bollinger Bands plot for the year 2015. I use .ta_plot(study='boll'):
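Bollinger Bands are a moving average with an upper and a lower band placed a fixed number of rolling standard deviations away from it. The sketch below computes them in plain pandas on made-up prices. The window of 20 and the multiplier of 2 are a common convention that I am assuming here; the defaults used by .ta_plot may differ.

```python
import numpy as np
import pandas as pd

# Made-up Close prices standing in for Bank of America in 2015.
dates = pd.date_range("2015-01-01", periods=60, freq="B")
rng = np.random.default_rng(3)
bac_close = pd.Series(
    17 + rng.normal(0, 0.2, len(dates)).cumsum(), index=dates, name="Close"
)

window = 20   # assumed look-back length
num_std = 2   # assumed band width in standard deviations

middle = bac_close.rolling(window).mean()      # middle band: moving average
rolling_std = bac_close.rolling(window).std()  # sample std (ddof=1)
upper = middle + num_std * rolling_std
lower = middle - num_std * rolling_std

bands = pd.DataFrame({"Lower": lower, "Middle": middle, "Upper": upper})
print(bands.tail())
```

The bands widen when prices are volatile and narrow when they are calm.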

Conclusion

  • Google Reader performance varies between countries
  • Almost all of the banks' stocks sank on Inauguration Day (2009-01-20)
  • Citigroup had its minimum return on the day of its reverse stock split (2011-05-06)
  • A simple line plot using .xs needs less code than a for loop but gives the same result
  • iplot is more interactive than a simple plot
  • .ta_plot() can be used for some financial analysis

Additional Resources

  • Header Backgrounds by Wallpaper Flare at wallpaperflare.com
  • For further explanation of the Python code, please check this link.